Summary

All things are lawful for me, but not all things are helpful; all things are lawful for me, but not all things edify. (1 Cor 10:23)

  1. Marginal and conditional effects are about what you do with your model after it has been fit. They are intrinsically interpretive, which means they depend both on understanding your model as a model and understanding your model as a representation of the world. This understanding is the difference between using and abusing marginal/conditional effects.

  2. For all but the simplest cases, data visualization alone is inadequate, and you need a model.

  3. For all but the simplest models, model parameters alone are hard to interpret, and you need to report and visualize your model in terms of marginal/conditional effects.

  4. The terminology of marginal and conditional effects is bewilderingly inconsistent. Each alternative definition is useful and legitimate, but each is incompatible with the others. Barring actual coding errors, there is no such thing as a “wrong” marginal/conditional effect, but there are wrongly interpreted effects, and poorly chosen effects.

  5. The best way to sort all this out is to read this online book and use the accompanying R package marginaleffects.

As a disclaimer, please note that throughout this demonstration, I will be ignoring completely the issue of causal inference, which is the most important thing about any statistical analysis (at least in ecology). The point is just to illustrate the behavior of models.

A conversation

Dear Andrew,

First of all, thank you so much for all the work you put into your blog. It’s the best resource I have found for the sort of modeling questions I encounter in my work.

If I may, I have a question concerning the definitions of “marginal” vs. “conditional” effects. I have encountered what would appear to be at least three alternative definitions:

  1. In your post “Marginalia”, you use the “slider” vs. “switch” analogy; marginal effects are partial derivatives for continuous variables, while conditional effects are discrete “deltas” associated with factor levels. Fair enough — a good, clean definition.

  2. In your more recent post, you discuss how the terms marginal and conditional take on a different meaning in the context of hierarchical modeling. If I understand correctly, conditional effects are based solely on the fixed terms of the model, while marginal effects (somehow) incorporate the uncertainty arising from the random effects.

  3. In the documentation of ggeffects, Daniel Lüdecke offers what would seem to be yet another definition: “[The] effects returned by ggpredict() can be described as conditional effects (i.e. these are conditioned on certain levels of factors), while ggemmeans() and ggeffect() return marginal means, since the effects are”marginalized” (or “averaged”) over the levels of factors.”

I’d be very grateful for any light you can shed on this.

Best,

Doug


Hello!

Unfortunately all three definitions are true 😭

So marginal effects can refer to little changes in a slider (partial derivatives, or differential calculus), and also to “marginalizing” and averaging across variables and finding marginal means, like the {ggeffects} definition (averages, or integral calculus). The same word means opposite concepts(!). PLUS in the multilevel model paradigm, it refers to dealing with (or not dealing with) random effects.

It’s all a horrible mess.

Andrew Heiss

Packages

# Data
library(palmerpenguins)

# Frequentist modeling and visualization
library(lme4)
library(mgcv)
library(ggeffects)

# Post-hoc analyses
library(emmeans)
library(marginaleffects)

# Handling and visualization
library(tidyverse)
library(ggthemes)
library(tidymodels)
library(modelsummary)
library(modelr)
library(see)

An introduction to the data

You might be wondering what the palmerpenguins dataset is about. If you guessed penguins, you’re right. After reading the data in from the palmerpenguins package, we have a data frame called penguins. We need to make one quick modification. In the original data, the variable year* is coded as an integer column. We want to change year to a factor column so that we can treat it as a discrete rather than continuous variable, and we can do this in one line with a call to dplyr::mutate.

data("penguins", package = "palmerpenguins") # read in data from package 

penguins <- penguins %>%
  mutate(year = factor(year)) # convert `year` to factor
Artwork by @allison_horst
Artwork by @allison_horst
datasummary_skim(penguins, type = "categorical")
N %
species Adelie 152 44.2
Chinstrap 68 19.8
Gentoo 124 36.0
island Biscoe 168 48.8
Dream 124 36.0
Torgersen 52 15.1
sex female 165 48.0
male 168 48.8
year 2007 110 32.0
2008 114 33.1
2009 120 34.9
Artwork by @allison_horst
Artwork by @allison_horst
datasummary_skim(penguins, type = "numeric")
Unique (#) Missing (%) Mean SD Min Median Max
bill_length_mm 165 1 43.9 5.5 32.1 44.5 59.6
bill_depth_mm 81 1 17.2 2.0 13.1 17.3 21.5
flipper_length_mm 56 1 200.9 14.1 172.0 197.0 231.0
body_mass_g 95 1 4201.8 802.0 2700.0 4050.0 6300.0

When data visualization is enough

Before we talk about models, I think it is worth mentioning that we don’t always need them. Let’s say someone has claimed that the sex ratios of penguins changed dramatically in response to the 2008 financial crisis. In this case, there is nothing a model can tell you that isn’t already obvious in a simple visualization of the data, and it would be silly to try fitting a binomial regression. This is a trivial example, but there are real questions in ecology that are just as trivial.

ggplot(filter(penguins, !is.na(sex)), aes(year, fill = sex)) +
  geom_bar(position = "fill") +
  scale_fill_solarized() +
  facet_wrap(~species) +
  theme_lucid()

When data visualization is not enough

In all but the simplest cases, though, merely visualizing the data is insufficient and potentially misleading.

Let’s say we are interested in understanding bill depth as a linear function of body mass. We’re finally going to settle the classic question of whether big birds have big beaks. So, we plot bill depth ~ body mass and add a convenient best-fit line with geom_smooth. Lo and behold, there would appear to be an exciting, counter-intuitive finding. Write it up for Nature!

ggplot(penguins, aes(body_mass_g, bill_depth_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  theme_lucid()

Again, this is a crude example, and it would be obvious to anybody who doesn’t work for Morgan Stanley that there is something very wrong with that smooth line. But much subtler versions of this problem happen all the time in ecology. In this particular case, we can make the data visualization much less misleading simply by adding species to the plot.

ggplot(penguins, aes(body_mass_g, bill_depth_mm, color = species)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_color_solarized() +
  theme_lucid()

But what if flipper length also has something to do with bill depth? This becomes a harder problem to solve with mere visualization. Adding an alpha aesthetic doesn’t help much. We need a model.

ggplot(penguins, aes(body_mass_g, bill_depth_mm, color = species, alpha = flipper_length_mm)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_color_solarized() +
  theme_lucid()

Marginal and conditional effects in multiple regression models

Understanding the model

Consider the model below.

mod.01 <- lm(bill_depth_mm ~ body_mass_g + flipper_length_mm + species,
             data = penguins)

It’s one way to express the question we just asked: how does bill depth relate to body mass, flipper length, and species? When we specify a linear model like this, we are saying that the bill depth of any given penguin \(i\) falls within a normal distribution around a mean of \(\mu\) with a standard deviation of \(\sigma\).

\[ bill.depth \sim \mathcal{N}(\mu, \sigma) \] \(\mu\) is called the linear predictor, because it is a linear function that predicts the mean of the response variable. In our model, \(\mu\) takes the following form:

\[ \mu = \beta_0 + \beta_1body.mass + \beta_2flipper.length + \beta_3species.Chinstrap + \beta_4species.Gentoo \]

To be honest, this is about where my mathematical understanding of linear regression ends. There are magical algorithms that use things like calculus and linear algebra to estimate each of the \(\beta\) parameters (along with the standard deviation \(\sigma\)) from the data. Rather than going into how they are calculated, let’s focus on what the parameters mean.

\(\beta_0\): The value of the response variable when body mass and flipper length equal zero and species equals the reference level of Adelie. If you’re trying to imagine an Adelie penguin with zero body mass, you’ve already noticed an inherent limitation of linear modeling. But that is what the parameter as such means. As long as we do not try to use the model to predict outcomes for unrealistic values of the predictor variables, the model can still be useful. This is a helpful reminder that every model can be reduced to absurdity, because models of the world are not the world. Anyway, moving along…

\(\beta_1\): The slope of bill depth body mass, conditional on flipper length being set to its mean and species being set to its reference level (Adelie) — but in a model without interactions, the assumption is that this slope is constant across all covariate values. This slope is — drum roll — the marginal effect of body mass on bill depth.

\(\beta_2\): The slope of bill depth ~ flipper length, conditional on body mass being set to its mean and species being set to its reference level (Adelie). Again, in a model without interactions, the assumption is that this slope is constant across all covariate values. This slope is the marginal effect of flipper length on bill depth.

\(\beta_3\): The change in bill depth when you go from species = Adelie to species = Chinstrap, conditional on flipper length and body size being set to their means. This difference is the conditional effect of species = Chinstrap on bill depth.

\(\beta_4\): The change in bill depth when you go from species = Adelie to species = Gentoo, conditional on flipper length and body size being set to their means. This difference is the conditional effect of species = Gentoo on bill depth.

Here we finally encounter the terms that this R Club is all about: marginal and conditional effects. A marginal effect is the slope of the response variable in relation to a continuous predictor (conditional on all covariates), while a conditional effect is the discrete change of the response variable in relation to a categorical variable (again, conditional on all covariates).

In ecology, these \(\beta\) parameters become the focus of our interpretation, which is almost invariably causal. As I said, we are not focusing on causal inference in this workshop, but when we use linear regression in our research, we interpret marginal and conditional effects as causal effects, meaning that intervening in the system to change, say, body mass, would cause bill depth to change at a slope of \(\beta_1\).

Marginal and conditional effects

Let’s take a look at the parameter estimates that the mod.01 gives us. We’ll use tidymodels to extract the model estimates into a nice clean tibble, but the same information can be obtained with summary(). The estimate column contains our \(\beta\) parameters, which, as we discussed above, are the marginal and conditional effects of each of our predictor variables. We can calculate 95% confidence intervals for each estimate by adding +/- 2x the standard error. For convenience, we’ll also make a column that distinguishes marginal effects, conditional effects, and the intercept.

mod.01_estimates <- tidy(mod.01) %>%
  mutate(upper.95 = estimate + 2*std.error,
         lower.95 = estimate + -2*std.error) %>%
  mutate(class = case_when(
    term == "(Intercept)" ~ "Intercept",
    term %in% c("body_mass_g", "flipper_length_mm") ~ "Marginal effects",
    term %in% c("speciesChinstrap", "speciesGentoo") ~ "Conditional effects"
  ))

mod.01_estimates
## # A tibble: 5 × 8
##   term             estimate std.error statistic  p.value upper.95 lower.95 class
##   <chr>               <dbl>     <dbl>     <dbl>    <dbl>    <dbl>    <dbl> <chr>
## 1 (Intercept)       7.72     1.43          5.39 1.36e- 7 10.6      4.86e+0 Inte…
## 2 body_mass_g       0.00124  0.000125      9.93 1.45e-20  0.00149  9.92e-4 Marg…
## 3 flipper_length_…  0.0317   0.00870       3.65 3.09e- 4  0.0491   1.43e-2 Marg…
## 4 speciesChinstrap -0.152    0.135        -1.13 2.61e- 1  0.118   -4.23e-1 Cond…
## 5 speciesGentoo    -5.94     0.221       -26.8  1.54e-85 -5.49    -6.38e+0 Cond…

Let’s visualize these marginal and conditional effects.

ggplot(filter(mod.01_estimates, class != "Intercept"), 
       aes(term, estimate, 
           ymin = lower.95, 
           ymax = upper.95)) +
  geom_point() +
  geom_errorbar(width = 0.25) +
  facet_wrap(~class, scales = "free") +
  theme_lucid()

The conditional effect of species = Chinstrap (\(\beta_3\)) is negative but close to zero, with it’s 95% confidence interval including positive values. At alpha = 0.05, then, we would all this effect non-significant. All else held equal, Adelie and Chinstrap penguins have similar bill depth.

The conditional effect of species = Gentoo (\(\beta_4\)), however, is around -6, with a 95% confidence interval extending from around 5.5 to around 6.4. This means that bill depth decreases by about 6 mm when you go from species = Adelie to species = Gentoo.

The marginal effect of body size (\(\beta_1\)) is around 0.001, but with a very tight confidence interval that makes it significantly positive at alpha = 0.05. Remember, the units of the predictor will determine the scale of the \(\beta\) coefficient. Since we’re measuring body mass in grams, it is not surprising that the slope of bill depth ~ body mass is small. It might make sense to rescale body mass to kg units, but we’ll leave it as it is for now. For every 1 g increase in body mass, we expect a 0.001 mm increase in bill depth.

The marginal effect of flipper length (\(\beta_2\)) is around 0.03, significantly positive at alpha = 0.05. This means that for ever 1 mm increase in flipper length (again, we’re dealing with small units here), we expect a 0.03 mm increase in bill depth.

Marginal and conditional means

In this relatively simple model, it is fairly easy to interpret the marginal and conditional effects directly, as we have done above. Often, though, it is more intuitive to visualize the predicted values of the response variable generated by the marginal and conditional effects. When we do this, we are working with marginal and conditional means, i.e. the predicted mean value of the response variable (with uncertainty) given specified values of the predictor variables. This is the most common (and probably the best) way to visualize the results of a multiple regression model.

Remember, a linear regression is literally just a mathematical equation that can be solved given the values of your predictors. The only tricky part is to keep track of the uncertainty associated with each parameter estimate and propagate it appropriately to your predictions. There are several ways to do this in R, but for now we will just consider the simplest and most graphically-oriented option: ggeffects.

The strength of ggeffects is that it is designed with visualization in mind, so it only takes a few lines of code to yield nice plots of your marginal/conditional effects. We start by generating predictions with ggpredict(), which by default generates marginal/conditional predictions for all the predictors used in your model. The output is a list of data frames, one for each variable in your model, containing a column of predictions and corresponding confidence intervals.

Extract predictions

mod.01_predictions <- ggpredict(mod.01)
mod.01_predictions
## $body_mass_g
## # Predicted values of bill_depth_mm
## 
## body_mass_g | Predicted |         95% CI
## ----------------------------------------
##        2600 |     17.20 | [16.82, 17.58]
##        3000 |     17.70 | [17.40, 18.00]
##        3600 |     18.44 | [18.25, 18.64]
##        4000 |     18.94 | [18.77, 19.11]
##        4400 |     19.44 | [19.24, 19.64]
##        5000 |     20.18 | [19.88, 20.48]
##        5400 |     20.68 | [20.29, 21.07]
##        6400 |     21.92 | [21.31, 22.54]
## 
## Adjusted for:
## * flipper_length_mm = 197.00
## *           species = Adelie
## 
## $flipper_length_mm
## # Predicted values of bill_depth_mm
## 
## flipper_length_mm | Predicted |         95% CI
## ----------------------------------------------
##               170 |     18.15 | [17.73, 18.57]
##               180 |     18.46 | [18.19, 18.73]
##               190 |     18.78 | [18.62, 18.94]
##               200 |     19.10 | [18.90, 19.30]
##               210 |     19.42 | [19.08, 19.75]
##               220 |     19.73 | [19.24, 20.22]
##               230 |     20.05 | [19.40, 20.70]
##               240 |     20.37 | [19.55, 21.19]
## 
## Adjusted for:
## * body_mass_g = 4050.00
## *     species =  Adelie
## 
## $species
## # Predicted values of bill_depth_mm
## 
## species   | Predicted |         95% CI
## --------------------------------------
## Adelie    |     19.00 | [18.83, 19.17]
## Chinstrap |     18.85 | [18.63, 19.07]
## Gentoo    |     13.07 | [12.74, 13.39]
## 
## Adjusted for:
## *       body_mass_g = 4050.00
## * flipper_length_mm =  197.00
## 
## attr(,"class")
## [1] "ggalleffects" "list"        
## attr(,"model.name")
## [1] "mod.01"

Plot marginal/conditional means

If you want to make custom plots, you can always pull these data frames out of the list and do whatever you want with them. For now, though, we will use the default plotting functions built into ggeffects, which are quite nice. Here are the marginal and conditional means inferred from our model:

plot(mod.01_predictions)
## $body_mass_g

## 
## $flipper_length_mm

## 
## $species

Excursus

Having seen the right way to visualize a multiple regression model, let’s enjoy the spectacle of the wrong way: plotting the raw data, plucking the p-values out of the model, and adding decorative asterisks. I still see this all the time in papers that I review, and it is an indication that the authors do not understand multiple regression.

Let’s say the focus of our study was the difference in bill depth across species. Adelie and Chinstrap penguins seem to have roughly the same bill depth, but Gentoo penguins seem to have shallower bills. But is this significant? Let’s dig into our model summary table.

summary(mod.01)
## 
## Call:
## lm(formula = bill_depth_mm ~ body_mass_g + flipper_length_mm + 
##     species, data = penguins)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.2561 -0.5615 -0.0444  0.5629  2.6971 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        7.723542   1.434214   5.385 1.36e-07 ***
## body_mass_g        0.001242   0.000125   9.934  < 2e-16 ***
## flipper_length_mm  0.031727   0.008702   3.646 0.000309 ***
## speciesChinstrap  -0.152278   0.135188  -1.126 0.260791    
## speciesGentoo     -5.936424   0.221499 -26.801  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8632 on 337 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.8112, Adjusted R-squared:  0.8089 
## F-statistic: 361.9 on 4 and 337 DF,  p-value: < 2.2e-16

Ah, yes, indeed the effect of species = Gentoo is significant but the effect of species = Chinstrap is not. Let’s plot some boxplots and stick those asterisks on there.

ggplot(penguins, aes(species, bill_depth_mm)) +
  geom_boxplot() +
  annotate("text", x = 3, y = 20, label = "***") +
  theme_lucid()
## Warning: Removed 2 rows containing non-finite values (`stat_boxplot()`).

Now, in this particular case, no great harm has been done. The problem, though, is that this throws away everything we learned from our model except the p-values. That’s like baking a fancy birthday cake, then throwing away everything except the candles. P-values by themselves mean very nearly nothing and should never be interpreted. Always and only interpret parameter estimates (and/or their predictions) with their corresponding uncertainty.

Baby Ellie examines her model.
Baby Ellie examines her model.

Checkpoint

My guess is that what we have covered so far was perhaps mildly uncomfortable but familiar. Some of you already have a lot of experience fitting multiple regression models and visualizing them with ggeffects. That’s good. Now you can articulate that procedure in terms of marginal and conditional effects.

You probably have questions, though. What about post-hoc tests? Interactions? Models with nonlinearities? Random effects?

Stand by.

When model parameters are not enough

As models become more complex, parameter estimates alone become hard or impossible to interpret, and the notion of marginal and conditional effects becomes more complicated (and eventually contradictory). Let’s consider two kinds of models that we use a lot in ecology: GAMs and interaction models.

GAMs

None of the variables in the penguins data set have curvy relationships, so we’ll have to cheat a bit. If we take only the Chinstrap and Adelie species and plot body mass against bill_length, ignoring species, we get a curvy shape. Again, this is just to give us something to fit a GAM to.

data <- penguins %>%
  filter(species %in% c("Adelie", "Chinstrap")) %>%
  select(bill_length_mm, body_mass_g)

ggplot(data, aes(body_mass_g, bill_length_mm)) +
  geom_point() +
  geom_smooth() +
  theme_lucid()

We’ll fit a very simple GAM in which bill length is smooth function of body mass. Please don’t take this as an example of how to fit GAMs. The goal is just to demonstrate a problem.

mod.02 <- gam(bill_length_mm ~ s(body_mass_g),
              data = data)

We tidy the model up, and right away we see the problem. Where is the \(\beta\) coefficient? How are we supposed to get a marginal effect out of this model?

mod.02_estimates <- tidy(mod.02)
mod.02_estimates
## # A tibble: 1 × 5
##   term             edf ref.df statistic   p.value
##   <chr>          <dbl>  <dbl>     <dbl>     <dbl>
## 1 s(body_mass_g)  1.91   2.41      10.4 0.0000192

If we visualize the model, the nature of the problem becomes obvious. The slope of bill length ~ body mass is not a constant; it changes depending on where you are along the x-axis of body mass. What could the marginal effect of body mass even mean in a model like this?

mod.02_predictions <- ggpredict(mod.02)
plot(mod.02_predictions)
## $body_mass_g

Hold that thought while we consider another problem.

Interactions

Now consider a slightly more serious model. Let’s say we’re reconsidering mod_01, and we really think it should include an interaction between our factor species and our continuous variables flipper length. For simplicity, we’ll drop body mass from this model.

mod.03 <- lm(bill_depth_mm ~
               flipper_length_mm +
               species +
               flipper_length_mm:species,
             data = penguins)

Tidy it up and inspect the parameter estimates. We have a \(\beta\) coefficient for flipper length, \(\beta\) coefficients for the species levels of Chinstrap and Gentoo, and then \(\beta\) coefficients for flipper length : Chinstrap and flipper length : Gentoo. Can we understand these \(\beta\) coefficients as marginal/conditional effects?

mod.03_estimates <- tidy(mod.03)
mod.03_estimates
## # A tibble: 6 × 5
##   term                               estimate std.error statistic    p.value
##   <chr>                                 <dbl>     <dbl>     <dbl>      <dbl>
## 1 (Intercept)                          7.47      2.31        3.24 0.00131   
## 2 flipper_length_mm                    0.0572    0.0121      4.72 0.00000350
## 3 speciesChinstrap                    -7.14      3.99       -1.79 0.0747    
## 4 speciesGentoo                      -15.7       3.74       -4.20 0.0000344 
## 5 flipper_length_mm:speciesChinstrap   0.0351    0.0206      1.71 0.0890    
## 6 flipper_length_mm:speciesGentoo      0.0497    0.0182      2.73 0.00667

Let’s look at them one at a time.


\(\beta_0\): Just like in mod.01, the intercept represents bill_depth when species = Adelie and flipper length = 0.

\(\beta_1\): This is the slope of bill depth ~ flipper length, but there’s a catch. In mod.01, which did not have interactions, this slope was assumed to be the same across all values of the covariates. In this interaction model, that is no longer true; this parameter is the slope of bill depth ~ flipper length only for species = Adelie. We can no longer consider this a marginal effect the way we did in mod.01.

\(\beta_2\): As in mod01, this is the change in bill depth when you go from species = Adelie to species = Chinstrap, conditional on flipper length being set to its mean. This difference is the conditional effect of species = Chinstrap on bill depth.

\(\beta_3\): Same as above, but for Gentoo.

Now things get tricky.

\(\beta_4\): This is the value that gets added to \(\beta_1\) to give you the slope of bill depth ~ flipper length given species = Chinstrap.

\(\beta_5\): This is the value that gets added to \(\beta_1\) to give you the slope of bill depth ~ flipper length given species = Gentoo.


This is problematic. We have conditional effects for species, but none of the \(\beta\) coefficients can be interpreted as a marginal effect of flipper length. What do we do?

Mixed effects

Now let’s consider one more kind of model that complicates our notion of marginal effects: a mixed-effect model. Returning to our penguin data, let’s use species as a random intercept rather than a fixed effect. Again, for simplicity flipper length will be the only continuous variable we consider.

The math of a mixed effects model looks like this:

\[ bill.depth \sim \mathcal{N}(\mu_j, \sigma_y) \]

The subscript in \(\mu_j\) tells us that \(\mu\) varies with \(j\), which in our model is species. Specifically, the intercept \(\beta_0\) gets a value \(b_{0_j}\) added to it for each level of species.

\[ \mu = (\beta_0 + b_{0_j}) + \beta_1flipper.length \]

The value added to the intercept to represent species-level variation is drawn from a normal distribution with mean = 0 and a standard deviation estimated from the data:

\[b_{0_j} \sim \mathcal{N}(0, \sigma_0)\]

Honestly, I’m shaky on the math here, but it’s close enough for our purposes. Let’s fit the model.

mod.04 <- lmer(bill_depth_mm ~ flipper_length_mm + (1 | species),
               data = penguins)

Unfortunately, we cannot use tidy() on a mixed effects model, but we can use the lovely modelsummary() function from marginaleffects to get a neat overview of our parameter estimates.

modelsummary(mod.04)
 (1)
(Intercept) 0.829
(2.415)
flipper_length_mm 0.082
(0.008)
SD (Intercept species) 3.117
SD (Observations) 0.980
Num.Obs. 342
R2 Marg. 0.110
R2 Cond. 0.920
AIC 988.6
BIC 1003.9
ICC 0.9
RMSE 0.97

What can we make of this? Well, the marginal effect of flipper length is simple enough — it’s just \(\beta_1\), the same as in mod.01. But what if we want marginal means? How do we account for the variation in the random effects term \(b_{0_j}\)? Or should we just ignore it?

ggeffects gives us a couple of options. The default behavior is to ignore the random effect and plot the predictions based only on the fixed terms. But we can also choose to incorporate the uncertainty arising from the random effect. Let’s see how this affects our results.

# Fixed only
mod.04_predictions_fixed <- ggpredict(mod.04)$flipper_length_mm %>%
  tibble() %>%
  mutate(model = "fixed")

# With random
mod.04_predictions_random <- ggpredict(mod.04, type = "re")$flipper_length_mm %>%
  tibble() %>%
  mutate(model = "random")

# Collate
mod.04_predictions <- bind_rows(mod.04_predictions_fixed, 
                                mod.04_predictions_random) %>%
  mutate(model = factor(model, levels = c("random", "fixed")))

First, notice that the fit lines for the two types of prediction are identical. This is because the line is based solely on \(\beta_1\) in both cases. What differs is the uncertainty around this estimate. When we ignore the random effect, we get a tighter confidence interval. In this case, because the random effect happens to be weak, the difference is very small, but it can be much more pronounced in models with stronger random effects.

ggplot(mod.04_predictions, aes(x, predicted, 
                               ymin = conf.low, 
                               ymax = conf.high,
                               fill = model)) +
  geom_ribbon(alpha = 0.5) +
  geom_line() +
  labs(x = "flipper_length_mm", y = "bill_depth_mm") +
  theme_lucid()

Which is the right choice? And which one can be called a “marginal mean”?

The horrible mess: defining marginal and conditional effects

We need to revisit the definitions of marginal and conditional effects in light of the models presented above. If it’s not still fresh in your mind, reread that email correspondence that I had with Andrew Heiss. The unfortunate truth is that there are at least three mutually incompatible definitions of marginal and conditional effects floating around the world of statisticians and practitioners. We can’t fix the terminology, but we can understand the different meanings that the terms can have, and how they relate to different types of models.

1. Sliders vs. switches

This is the definition I introduced with mod.01: a marginal effect is the (partial) slope on the response variable on a continuous predictor (e.g. bill_depth ~ flipper length), and a conditional effect is the change in the response variable when a categorical variable goes from its reference level (e.g. species = Adelie) to some other level (e.g. species = Gentoo). This, I think, happens to be the classical definition that real statisticians usually have in mind. For an excellent overview of this sense of marginal and conditional effects, see Andrew Heiss’ “Marginalia”.

2. Conditioning vs. averaging

This definition comes into play when your model has interactions (or random slopes). Under this definition, a marginal effect averages \(\beta\) parameters over an interacting covariate, while a conditional effect fixes the interacting covariate at a specified level. For example, in mod.03, the marginal effect of flipper length would be the average (or potentially some other summary) of the species-specific slopes, thus representing the slope expected for a hypothetical species that is the average of the three observed species. The conditional effect of flipper length would be the slope given a specific level of species, say species = Chinstrap. Thus, in that model, there would be only one marginal effect of flipper length but three conditional effects, one for each level of species.

3. Incorporating vs. ignoring random effects

Finally, in the context of mixed effect models, the terms “marginal” and “conditional” take on yet a third meaning with respect to predictions. Marginal means are based on predictions that incorporate the uncertainty arising from the random effects component of the model, while conditional means are based on predictions that use only the fixed terms of the model. Thus, conditional means will generally have tighter confidence intervals that marginal means. When it comes to interpretation, a marginal mean can be understood as the expected value for a randomly selected penguin that belongs to one of the 3 species, but you don’t know which one. This uncertainty is reflected in the wider confidence interval, which has to cover the full range of possibility comprised by Adelie, Chinstrap, and Gentoo penguins. In contrast, a conditional mean can be understood as the expected value for a hypothetical average penguin, an Adelchintoo, if you will.

Each of these definitions is, by itself, useful and legitimate. The problem, of course, is that they cannot be reconciled with each other. Whenever you use the term “marginal” or “conditional,” it is up to you to understand which of these definitions you are working with, to align this usage with the interpretation of your models, and to communicate all this clearly to your audience.

Introducing marginaleffects

Vincent Arel-Bundock appreciates the absurdity of the situation and designed an R packages to help you find your way to the marginal and/or conditional effect you need: marginaleffects. He even wrote an online book to go with the package. He is your friend. But it is still up to you to interpret your marginal/conditional effects legitimately.

Definition 1: sliders vs. switches

Remember mod.01?

mod.01
## 
## Call:
## lm(formula = bill_depth_mm ~ body_mass_g + flipper_length_mm + 
##     species, data = penguins)
## 
## Coefficients:
##       (Intercept)        body_mass_g  flipper_length_mm   speciesChinstrap  
##          7.723542           0.001242           0.031727          -0.152278  
##     speciesGentoo  
##         -5.936424

Let’s get some sliders and switches. We already did this manually, but the slopes() function from marginaleffects streamlines the process, particularly for more complex models.

mod.01_slopes <- slopes(mod.01)
mod.01_slopes
## 
##         Term        Contrast Estimate Std. Error      z Pr(>|z|)    S     2.5 %
##  body_mass_g dY/dX            0.00124   0.000125   9.93   <0.001 74.8  0.000997
##  body_mass_g dY/dX            0.00124   0.000125   9.93   <0.001 74.8  0.000997
##  body_mass_g dY/dX            0.00124   0.000125   9.93   <0.001 74.8  0.000997
##  body_mass_g dY/dX            0.00124   0.000125   9.93   <0.001 74.8  0.000997
##  body_mass_g dY/dX            0.00124   0.000125   9.93   <0.001 74.8  0.000997
##    97.5 %
##   0.00149
##   0.00149
##   0.00149
##   0.00149
##   0.00149
## --- 1358 rows omitted. See ?avg_slopes and ?print.marginaleffects --- 
##  species     Gentoo - Adelie -5.93642   0.221499 -26.80   <0.001 523.2
##  species     Gentoo - Adelie -5.93642   0.221499 -26.80   <0.001 523.2
##  species     Gentoo - Adelie -5.93642   0.221499 -26.80   <0.001 523.2
##  species     Gentoo - Adelie -5.93642   0.221499 -26.80   <0.001 523.2
##  species     Gentoo - Adelie -5.93642   0.221499 -26.80   <0.001 523.2
##      2.5 %   97.5 %
##  -6.370553 -5.50229
##  -6.370553 -5.50229
##  -6.370553 -5.50229
##  -6.370553 -5.50229
##  -6.370553 -5.50229
## Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, bill_depth_mm, body_mass_g, flipper_length_mm, species 
## Type:  response

You might be wondering how we got such a huge data frame. The answer is that slopes() assumes that you might be using a more complex model with interactions and such, so it gives you the marginal effect of each variable for each individual. In our data, the individuals are 344 penguins, and each row of our data frame is one measured penguin. Thus, slopes() provides 344 x 4 (body mass, flipper length, species = Chinstrap, species = Gentoo) rows. For us, this is superfluous, since the estimated effects are the same for all individuals. If we had interactions in our model, though, this would be important, since the effect of each variable on each penguin could depend on the values of the other variables for that penguin. Make sense? In any case, we can get rid of all that redundancy by adding the argument by = TRUE, which summarizes across individuals for each variable. We’ll also convert the output to a tibble and add the class column like we did before.

mod.01_slopes_sum <- slopes(mod.01, by = TRUE) %>%
  # for plotting convenience
  tibble() %>% 
  mutate(class = case_when(
    term %in% c("body_mass_g", "flipper_length_mm") ~ "Marginal effects",
    term == "species" ~ "Conditional effects"
  ))

mod.01_slopes_sum
## # A tibble: 4 × 10
##   term          contrast estimate std.error statistic   p.value s.value conf.low
##   <chr>         <chr>       <dbl>     <dbl>     <dbl>     <dbl>   <dbl>    <dbl>
## 1 body_mass_g   dY/dX     0.00124  0.000125      9.93 2.97e- 23   74.8   9.97e-4
## 2 flipper_leng… dY/dX     0.0317   0.00870       3.65 2.66e-  4   11.9   1.47e-2
## 3 species       Chinstr… -0.152    0.135        -1.13 2.60e-  1    1.94 -4.17e-1
## 4 species       Gentoo … -5.94     0.221       -26.8  3.13e-158  523.   -6.37e+0
## # ℹ 2 more variables: conf.high <dbl>, class <chr>

This plot should look familiar.

ggplot(mod.01_slopes_sum, 
       aes(contrast, estimate, 
           ymin = conf.low, 
           ymax = conf.high)) +
  geom_point() +
  geom_errorbar(width = 0.25) +
  facet_wrap(~class, scales = "free") +
  theme_lucid()

That was easy enough to do without the help of marginaleffects, but what happens when we turn to our mod.02, our lovely GAM?

mod.02
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## bill_length_mm ~ s(body_mass_g)
## 
## Estimated degrees of freedom:
## 1.91  total = 2.91 
## 
## GCV score: 27.31004

The slope of bill length ~ body mass varies smoothly with the value of body mass, so what is the “marginal effect”? Here, we have more than one option, and marginaleffects can help us.

First, we might want a bunch of marginal slopes at representative values of body mass. This is the default output of slopes(), and it approximates the first derivative of our GAM smooth. We could, of course, plot this manually, but here I’ll illustrate the built-in plotting function plot_slopes(), which is just a wrapper for ggplot2 (and can therefore be modified with ggplot functions, like theme_lucid()).

mod.02_slopes <- slopes(mod.02)
mod.02_slopes
## 
##         Term Estimate Std. Error    z Pr(>|z|)    S     2.5 %  97.5 %
##  body_mass_g  0.00412    0.00135 3.05  0.00232  8.8  1.47e-03 0.00677
##  body_mass_g  0.00386    0.00135 2.86  0.00429  7.9  1.21e-03 0.00651
##  body_mass_g  0.00571    0.00180 3.18  0.00147  9.4  2.19e-03 0.00924
##  body_mass_g  0.00538    0.00148 3.63  < 0.001 11.8  2.48e-03 0.00828
##  body_mass_g  0.00461    0.00134 3.44  < 0.001 10.7  1.98e-03 0.00724
## --- 209 rows omitted. See ?avg_slopes and ?print.marginaleffects --- 
##  body_mass_g  0.00278    0.00143 1.94  0.05254  4.3 -3.05e-05 0.00559
##  body_mass_g  0.00550    0.00153 3.59  < 0.001 11.6  2.50e-03 0.00850
##  body_mass_g  0.00399    0.00136 2.94  0.00329  8.2  1.33e-03 0.00665
##  body_mass_g  0.00226    0.00152 1.49  0.13601  2.9 -7.12e-04 0.00524
##  body_mass_g  0.00399    0.00136 2.94  0.00329  8.2  1.33e-03 0.00665
## Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, bill_length_mm, body_mass_g 
## Type:  response
plot_slopes(mod.02, variables = "body_mass_g", condition = "body_mass_g") +
  theme_lucid()

Alternatively, we might want to get the “overall” marginal effect by averaging the slope across all values of body mass. This would be interpreted as the expected effect of a change in body mass for a penguin of unknown body mass. This might sound like nonsense, but don’t jump to conclusions. Maybe you are considering a management intervention — like, I don’t know, supplemental feeding — that is expected to increase the body mass of the average penguin by 0.5 kg. With this average marginal effect, you could predict the average change in bill length in a population of penguins without knowing the starting body mass of any individuals. In marginaleffects, we get this “average marginal effect” by adding the by = TRUE argument, as we did before.

mod.02_slopes_sum <- slopes(mod.02, by = TRUE) %>%
  tibble()

mod.02_slopes_sum
## # A tibble: 1 × 8
##   term        estimate std.error statistic    p.value s.value conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>      <dbl>   <dbl>    <dbl>     <dbl>
## 1 body_mass_g  0.00406  0.000822      4.94    7.75e-7    20.3  0.00245   0.00567
ggplot(mod.02_slopes_sum, 
       aes(term, estimate, 
           ymin = conf.low, 
           ymax = conf.high)) +
  geom_point() +
  geom_errorbar(width = 0.25) +
  theme_lucid()

Finally, suppose we have a species interest in penguins weighing 3.5 kg. What is the marginal effect for that specific value of body mass? We can specify this — as well as the values of any other variables of interest — using the newdata argument with the helper function datagrid().

mod.02_slopes_conditional <- slopes(mod.02, 
                                    newdata = datagrid(body_mass_g = 3500)) %>%
  tibble()

ggplot(mod.02_slopes_conditional, 
       aes(term, estimate, 
           ymin = conf.low, 
           ymax = conf.high)) +
  geom_point() +
  geom_errorbar(width = 0.25) +
  theme_lucid()

Definition 2: conditioning vs. averaging

Now let’s turn to out interaction model. In mod.03, the slope of bill depth ~ flipper length varies across penguin species.

mod.03
## 
## Call:
## lm(formula = bill_depth_mm ~ flipper_length_mm + species + flipper_length_mm:species, 
##     data = penguins)
## 
## Coefficients:
##                        (Intercept)                   flipper_length_mm  
##                            7.47494                             0.05723  
##                   speciesChinstrap                       speciesGentoo  
##                           -7.14033                           -15.71179  
## flipper_length_mm:speciesChinstrap     flipper_length_mm:speciesGentoo  
##                            0.03513                             0.04968

In this context, the marginal effect of flipper length on bill depth means averaging the effect of flipper length across species. As you might expect, we can do this with the by = TRUE argument.

mod.03_marginalslopes <- slopes(mod.03, variables = "flipper_length_mm", by = TRUE) %>%
  tibble()

mod.03_marginalslopes
## # A tibble: 1 × 8
##   term          estimate std.error statistic  p.value s.value conf.low conf.high
##   <chr>            <dbl>     <dbl>     <dbl>    <dbl>   <dbl>    <dbl>     <dbl>
## 1 flipper_leng…   0.0821   0.00796      10.3 6.62e-25    80.3   0.0665    0.0977
ggplot(mod.03_marginalslopes, 
       aes(term, estimate, 
           ymin = conf.low, 
           ymax = conf.high)) +
  geom_point() +
  geom_errorbar(width = 0.25) +
  theme_lucid()

And we can plot the marginal predictions using the handy plot_predictions() function.

plot_predictions(mod.03, condition = "flipper_length_mm") +
  theme_lucid()

The conditional effect of flipper length, on the other hand, is the effect of flipper length for each species, or for a select species. We can get this by adding the argument condition = species, and plot_slopes gives us a lovely visualization.

mod.03_conditionalslopes <- slopes(mod.03, variables = "flipper_length_mm", by = "species") %>%
  tibble()

mod.03_conditionalslopes
## # A tibble: 3 × 13
##   term   contrast species estimate std.error statistic  p.value s.value conf.low
##   <chr>  <chr>    <fct>      <dbl>     <dbl>     <dbl>    <dbl>   <dbl>    <dbl>
## 1 flipp… mean(dY… Adelie    0.0572    0.0121      4.72 2.38e- 6    18.7   0.0335
## 2 flipp… mean(dY… Chinst…   0.0924    0.0166      5.55 2.89e- 8    25.0   0.0597
## 3 flipp… mean(dY… Gentoo    0.107     0.0136      7.88 3.21e-15    48.1   0.0803
## # ℹ 4 more variables: conf.high <dbl>, predicted_lo <dbl>, predicted_hi <dbl>,
## #   predicted <dbl>
plot_slopes(mod.03, variables = "flipper_length_mm", condition = "species") + 
  theme_lucid()

Now let’s plot the conditional predictions. Maybe our main interest is in the conditional slope of bill depth ~ flipper length across species.

plot_predictions(mod.03, condition = c("flipper_length_mm", "species")) +
  theme_lucid()

Alternatively, maybe we want to focus on the conditional mean of bill depth across species while marginalizing (i.e. averaging) over flipper length. Under definition 2, you often marginalize and condition at the same time to get the estimate you want.

plot_predictions(mod.03, condition = c("species")) +
  theme_lucid()

Definition 3: incorporating vs. ignoring random effects

Finally, let’s turn our attention to question of random effects raised in mod.04.

mod.04
## Linear mixed model fit by REML ['lmerMod']
## Formula: bill_depth_mm ~ flipper_length_mm + (1 | species)
##    Data: penguins
## REML criterion at convergence: 980.5509
## Random effects:
##  Groups   Name        Std.Dev.
##  species  (Intercept) 3.117   
##  Residual             0.980   
## Number of obs: 342, groups:  species, 3
## Fixed Effects:
##       (Intercept)  flipper_length_mm  
##            0.8291             0.0817

The conditional prediction of bill depth ~ flipper length is based only on the fixed effect of flipper_length, ignoring the random effect of species. We get this by specifying re.form = NA.

plot_predictions(mod.04, 
                 condition = "flipper_length_mm",
                 re.form = NA) +
  theme_lucid()

To get the marginal prediction, incorporating the uncertainty arising from the random effect, we set re.form = NULL.

plot_predictions(mod.04, 
                 condition = "flipper_length_mm",
                 re.form = NULL) +
  theme_lucid()

Things that didn’t fit

I ran out of time to talk about two very important topics:

  1. Post-hoc contrasts
  2. Marginal and conditional effects in Bayesian models

I encourage you to read on your own or come talk to me.